266 ◾ Bioinformatics
know that to execute any QIIME2 command, you must use “qiime”. We also mentioned
above that “tools” is a command and that a command or plugin may have methods,
visualizers, or both. The “import” above is a method of the “tools” command whose function
is to import input data. However, each input file format requires different arguments and
options. In general, importing an input file into an artifact follows this format:
# --type: the raw data type
# --input-path: the directory
# --input-format: the raw data format
# --output-path: where artifact file is to be saved
qiime tools import \
--type xxx \
--input-path xxx \
--input-format xxx \
--output-path xxx
The “import” options are self-explanatory: “--type” specifies the type of the data to
be imported, “--input-path” specifies the path to the raw data files, “--input-format”
specifies the input file format, and “--output-path” specifies the path where the artifact
file will be saved.
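As an illustration of the general format, the following minimal sketch fills the template with concrete values and prints the resulting command. The wrapper “qiime_import” and the file names “manifest.tsv” and “paired-end-demux.qza” are hypothetical placeholders that are not part of QIIME2; the type and format names are the ones we assume for paired-end FASTQ files listed in a manifest file, so check them against the QIIME2 documentation for your release.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of QIIME2): fills the import template.
# By default it only prints the command; set RUN=1 to execute it.
qiime_import() {
  # $1 = data type, $2 = input path, $3 = input format, $4 = output artifact
  set -- qiime tools import --type "$1" --input-path "$2" \
    --input-format "$3" --output-path "$4"
  if [ "${RUN:-0}" = 1 ]; then
    "$@"                  # run the real command (requires QIIME2)
  else
    printf '%s\n' "$*"    # dry run: show the command (shell quoting is not shown)
  fi
}

# Assumed example values; manifest.tsv and paired-end-demux.qza are placeholders.
qiime_import 'SampleData[PairedEndSequencesWithQuality]' manifest.tsv \
  PairedEndFastqManifestPhred33V2 paired-end-demux.qza
```

Printing the command first makes it easy to confirm that the type and the format are paired as intended before the import is actually run.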
You must specify the correct data type and format with “--type” and “--input-format” for
your input files to be imported successfully.
The following sections show the import formats for the common types of raw data; for
practice and detailed information, visit “https://docs.qiime2.org/”.
7.3.1.1.1 Importing Raw Data in FASTA
If the raw data is in FASTA format, all sequences must be in a single FASTA file consisting
of exactly two lines for each sequence record: a definition line and a single line for the
sequence (not split across multiple lines). The ID in each definition line must follow the
FASTA specification (no white space between the “>” and the ID). Moreover, the ID must
follow the “SAMPLEID_SEQID” format, where SAMPLEID is a unique sample identifier
and SEQID is a sequence identifier. Assuming that your sequences are in the
“sequences.fna” file, you can use the following “qiime tools import” command to create an
artifact for the input file:
qiime tools import \
--type 'FeatureData[Sequence]' \
--input-path sequences.fna \
--output-path sequences.qza
The above command will import the FASTA sequences as a QIIME2 artifact
“sequences.qza” that is ready for the next step of the analysis.
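Because the import expects exactly two lines per record and IDs in the “SAMPLEID_SEQID” format, it can help to check the FASTA file before importing it. The following is a minimal sketch of such a check. The function “check_fasta” is a hypothetical helper and is not part of QIIME2, and the ID test only assumes that a valid ID has non-empty text on both sides of an underscore.

```shell
#!/bin/sh
# check_fasta: hypothetical helper (not part of QIIME2).
# Reports records that break the two-lines-per-record layout or whose ID
# does not look like SAMPLEID_SEQID. Returns 0 if no problem is found.
check_fasta() {
  awk '
    NR % 2 == 1 {
      # Odd lines must be definition lines: ">" immediately followed by the ID.
      if ($0 !~ /^>[^ \t>]/) {
        printf "line %d: expected a definition line\n", NR
        bad = 1
        next
      }
      id = substr($1, 2)
      # Assumption: non-empty text on both sides of an underscore.
      if (id !~ /^.+_.+$/) {
        printf "line %d: ID \"%s\" is not SAMPLEID_SEQID\n", NR, id
        bad = 1
      }
      next
    }
    # Even lines must hold the whole sequence on a single line.
    /^>/ || /^[ \t]*$/ {
      printf "line %d: expected a sequence line\n", NR
      bad = 1
    }
    END {
      if (NR == 0 || NR % 2 != 0) {
        printf "file is empty or ends without a sequence line\n"
        bad = 1
      }
      exit bad + 0
    }
  ' "$1"
}
```

For example, “check_fasta sequences.fna && echo OK” prints OK only when every record passes both checks. A sequence wrapped over several lines is reported because the line after it is not a definition line.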
7.3.1.1.2 Importing EMP-Multiplexed FASTQ Reads
The EMP-multiplexed FASTQ files contain multiplexed reads, but the barcode sequences
are in a separate file. Therefore, there will be two files (forward FASTQ file and barcode
FASTQ file) for single-end reads and three files for paired-end reads (forward, reverse,